5 Archaeobotany
The results presented here are preliminary and the chapter has yet to be written.
In this chapter, I will present the macrobotanical data from the 170 case studies used to carry out this research (Chapter 3), along with the quantifications performed on the absolute counts. The data will first be presented chronologically, and a discussion of the diachronic trends will follow at the end of the chapter.
5.1 Case studies
The following map shows the sites under investigation, divided by chronology. Please select the desired chronology (or chronologies) from the legend on the right.
R = Roman, LR = Late Roman, EMA = Early Middle Ages, Ma = 11th c. onwards
5.2 Ubiquity
In Chapter 4, ubiquity was described as the best way to present the archaeobotanical remains from the Italian peninsula, given the numerous biases in the samples. The heatmap below (Figure 5.2) provides a good overview of the temporal trends in the presence of cereals, legumes, fruits and nuts across the entire area under examination.
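As a reminder of what the heatmap values represent, ubiquity can be sketched as the percentage of sites at which a taxon occurs at least once, regardless of how many remains were counted. The snippet below is a minimal Python illustration, not the `archaeobotany_tables()` function used in this chapter; the site names and counts are invented.

```python
# Ubiquity: share of sites (in %) at which a taxon is present at least once.
# The site names and counts below are invented for illustration only.
site_counts = {
    "site_a": {"Barley": 12, "Emmer": 0, "Grape": 3},
    "site_b": {"Barley": 5, "Emmer": 2, "Grape": 0},
    "site_c": {"Barley": 0, "Emmer": 0, "Grape": 7},
    "site_d": {"Barley": 40, "Emmer": 0, "Grape": 1},
}


def ubiquity(counts_by_site):
    """Return {taxon: percentage of sites where the taxon count is > 0}."""
    n_sites = len(counts_by_site)
    taxa = sorted({taxon for counts in counts_by_site.values() for taxon in counts})
    result = {}
    for taxon in taxa:
        present = sum(
            1 for counts in counts_by_site.values() if counts.get(taxon, 0) > 0
        )
        result[taxon] = 100 * present / n_sites
    return result


ubiquity_by_taxon = ubiquity(site_counts)
print(ubiquity_by_taxon)
```

Note that the 40 barley remains at the hypothetical `site_d` weigh exactly as much as the 5 at `site_b`: only presence matters, which is why the measure is robust to uneven sample sizes.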
Show the code
# Load the libraries
# Note: these libraries are used for the data visualizations in this page.
library(RColorBrewer)
library(reshape2)
library(ggplot2)
library(hrbrthemes)
library(plotly)
library(patchwork)
library(dplyr) # provides %>%, mutate() and filter(), used in the code below
## UBIQUITY
## Creating a dataframe that contains the ubiquity of each century under examination.
Ubiquity_table <- data.frame(
"I BCE" = archaeobotany_tables(plants_export, -1)$Ubiquity_exp,
"I CE" = archaeobotany_tables(plants_export, 1)$Ubiquity_exp,
"II CE" = archaeobotany_tables(plants_export, 2)$Ubiquity_exp,
"III CE" = archaeobotany_tables(plants_export, 3)$Ubiquity_exp,
"IV CE" = archaeobotany_tables(plants_export, 4)$Ubiquity_exp,
"V CE" = archaeobotany_tables(plants_export, 5)$Ubiquity_exp,
"VI CE" = archaeobotany_tables(plants_export, 6)$Ubiquity_exp,
"VII CE" = archaeobotany_tables(plants_export, 7)$Ubiquity_exp,
"VIII CE" = archaeobotany_tables(plants_export, 8)$Ubiquity_exp,
"IX CE" = archaeobotany_tables(plants_export, 9)$Ubiquity_exp,
"X CE" = archaeobotany_tables(plants_export, 10)$Ubiquity_exp,
"XI CE" = archaeobotany_tables(plants_export, 11)$Ubiquity_exp
)
# Transform the ubiquity table into a matrix
Ubiquity_mat <- as.matrix(Ubiquity_table)
# Rename the centuries
colnames(Ubiquity_mat) <- c("1st c. BCE", "1st c. CE", "2nd c. CE",
"3rd c. CE", "4th c. CE", "5th c. CE",
"6th c. CE", "7th c. CE", "8th c. CE",
"9th c. CE", "10th c. CE", "11th c. CE")
# The data has to be molten to use it with ggplot2
# (package: reshape2)
Ubiquity_melt <- melt(Ubiquity_mat)
# Let's now rename the columns
colnames(Ubiquity_melt) <- c("Taxon", "Century", "Ubiquity")
# Add a column for the text tooltip
Ubiquity_melt <- Ubiquity_melt %>%
mutate(text = paste0("Taxon: ", Taxon, "\n", "Century: ", Century, "\n", "Value: ",round(Ubiquity,2)))
# Create the heatmap with ggplot2
Ubiquity_HM <- ggplot(Ubiquity_melt, aes(Century, Taxon, fill=Ubiquity, text=text)) +
geom_tile(colour="white") +
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "right",
axis.ticks = element_blank(),
axis.text.x = element_text(angle = 90, hjust = 0)
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Ubiquity",
subtitle="Diachronical heatmap of recorded plant species"
) +
scale_fill_gradient(low = "white", high = "black")
5.2.1 Macroregional differences
The heatmap displayed in Figure 5.2 presents diachronic ubiquity values for the entire peninsula. However, it is also possible to look at macroregional differences in plant ubiquity. The R function Ubiquity_macroreg_chrono() (Section 1.3) was created to subset the data by (current) Northern, Central and Southern Italian regions. Subsetting the dataset required a coarser chronological division to obtain enough sites for a statistical interpretation of the results. The ubiquity values are therefore presented using the variable Chronology rather than the individual centuries. For a clearer reading of the plots, the taxa have been divided into three groups: Cereals, Pulses and Fruits/Nuts. Some taxa have been omitted from the plots.
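The subsetting logic described above can be sketched as a grouped ubiquity calculation. This is a hedged Python illustration rather than the chapter's `Ubiquity_macroreg_chrono()` function: the records are invented, and the `MIN_SITES` threshold is a hypothetical stand-in for the manual exclusion of groups with too few sites.

```python
from collections import defaultdict

# Hypothetical records: (macroregion, chronology, set of taxa present at the site).
records = [
    ("Northern Italy", "R", {"Barley", "Rye"}),
    ("Northern Italy", "R", {"Barley"}),
    ("Northern Italy", "R", {"Common wheat", "Barley"}),
    ("Southern Italy", "R", {"Common wheat"}),
    ("Southern Italy", "R", {"Common wheat", "Barley"}),
    ("Central Italy", "R", {"Emmer"}),
]

MIN_SITES = 2  # illustrative threshold only, not a value used in the chapter


def grouped_ubiquity(rows, min_sites=MIN_SITES):
    """Ubiquity (%) per taxon within each (macroregion, chronology) group."""
    groups = defaultdict(list)
    for macroregion, chronology, taxa in rows:
        groups[(macroregion, chronology)].append(taxa)

    table = {}
    for key, sites in groups.items():
        if len(sites) < min_sites:
            continue  # too few sites for a meaningful percentage
        all_taxa = set().union(*sites)
        table[key] = {
            taxon: round(100 * sum(taxon in site for site in sites) / len(sites))
            for taxon in sorted(all_taxa)
        }
    return table


ubiquity_table = grouped_ubiquity(records)
for group, values in ubiquity_table.items():
    print(group, values)
```

Dropping sparse groups before plotting avoids displaying percentages that rest on one or two sites, at the cost of leaving gaps in the comparison.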
Show the code: data preparation
# Ubiquity by Italian Macro regions: Northern, Central and Southern Italy
# Load the libraries
library(vegan)
library(matrixStats)
library(patchwork)
# Creating a dataframe with the ubiquities of all macroregions and chronologies
bot_macroreg <- rbind(
Ubiquity_R_NI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Northern Italy", "R"),
Ubiquity_R_CI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Central Italy", "R"),
Ubiquity_R_SI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Southern Italy", "R"),
Ubiquity_LR_NI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Northern Italy", "LR"),
Ubiquity_LR_CI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Central Italy", "LR"),
Ubiquity_LR_SI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Southern Italy", "LR"),
Ubiquity_EMA_NI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Northern Italy", "EMA"),
Ubiquity_EMA_CI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Central Italy", "EMA"),
Ubiquity_EMA_SI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Southern Italy", "EMA"),
Ubiquity_Ma_NI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Northern Italy", "Ma"),
Ubiquity_Ma_CI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Central Italy", "Ma"),
Ubiquity_Ma_SI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Southern Italy", "Ma")
)
# Re-arranging the cereals/macroregions for visualisation on the Y axis
level_macroreg_order <- c("Southern Italy", "Central Italy", "Northern Italy")
level_cereals_order <- c("Common.Wheat", "Barley", "Rye",
"Einkorn", "Emmer", "Proso.millet",
"Foxtail.millet", "Oats", "Sorghum")
# Cereals
cer_ubiquity_macroreg.R <- filter(bot_macroreg, Chronology=="R" & Plant.Type=="Cereals")
cer_ubiquity_macroreg.R <- filter(cer_ubiquity_macroreg.R, Macroregion!="Central Italy")
cer_ubiquity_macroreg.LR <- filter(bot_macroreg, Chronology=="LR" & Plant.Type=="Cereals")
cer_ubiquity_macroreg.EMA <- filter(bot_macroreg, Chronology=="EMA" & Plant.Type=="Cereals")
cer_ubiquity_macroreg.Ma <- filter(bot_macroreg, (Chronology=="Ma" & Plant.Type=="Cereals"))
cer_ubiquity_macroreg.Ma <- filter(cer_ubiquity_macroreg.Ma, Macroregion!="Southern Italy")
#Pulses
puls_ubiquity_macroreg.R <- filter(bot_macroreg, Chronology=="R" & Plant.Type=="Pulses")
puls_ubiquity_macroreg.R <- filter(puls_ubiquity_macroreg.R, Macroregion!="Central Italy")
puls_ubiquity_macroreg.R <- filter(puls_ubiquity_macroreg.R, Plant!="Chickpea")
puls_ubiquity_macroreg.LR <- filter(bot_macroreg, Chronology=="LR" & Plant.Type=="Pulses")
puls_ubiquity_macroreg.LR <- filter(puls_ubiquity_macroreg.LR,
Macroregion!="Southern Italy")
puls_ubiquity_macroreg.LR <- filter(puls_ubiquity_macroreg.LR, Plant!="Chickpea")
puls_ubiquity_macroreg.EMA <- filter(bot_macroreg, Chronology=="EMA" & Plant.Type=="Pulses")
puls_ubiquity_macroreg.Ma <- filter(bot_macroreg, Chronology=="Ma" & Plant.Type=="Pulses")
puls_ubiquity_macroreg.Ma <- filter(puls_ubiquity_macroreg.Ma,
Macroregion!="Southern Italy")
#Fruits (+ Subset)
fnuts_ubiquity_macroreg.R <- filter(bot_macroreg, Chronology=="R" & Plant.Type=="Fruits/Nuts")
fnuts_ubiquity_macroreg.R <- subset(fnuts_ubiquity_macroreg.R, (Plant == "Wild.Cherry" | Plant == "Walnut" | Plant == "Peach" | Plant == "Olive" |Plant == "Grape" | Plant =="Fig" | Plant =="Apple"))
fnuts_ubiquity_macroreg.R <- filter(fnuts_ubiquity_macroreg.R, Macroregion!="Central Italy")
fnuts_ubiquity_macroreg.LR <- filter(bot_macroreg, Chronology=="LR" & Plant.Type=="Fruits/Nuts")
fnuts_ubiquity_macroreg.LR <- subset(fnuts_ubiquity_macroreg.LR, (Plant == "Wild.Cherry" | Plant == "Walnut" | Plant == "Peach" | Plant == "Olive" |Plant == "Grape" | Plant =="Fig" | Plant =="Apple"))
fnuts_ubiquity_macroreg.EMA <- filter(bot_macroreg, Chronology=="EMA" & Plant.Type=="Fruits/Nuts")
fnuts_ubiquity_macroreg.EMA <- subset(fnuts_ubiquity_macroreg.EMA, (Plant == "Wild.Cherry" | Plant == "Walnut" | Plant == "Peach" | Plant == "Olive" |Plant == "Grape" | Plant =="Fig" | Plant =="Apple"))
fnuts_ubiquity_macroreg.Ma <- filter(bot_macroreg, Chronology=="Ma" & Plant.Type=="Fruits/Nuts")
fnuts_ubiquity_macroreg.Ma <- subset(fnuts_ubiquity_macroreg.Ma, (Plant == "Wild.Cherry" | Plant == "Walnut" | Plant == "Peach" | Plant == "Olive" |Plant == "Grape" | Plant =="Fig" | Plant =="Apple"))
fnuts_ubiquity_macroreg.Ma <- filter(fnuts_ubiquity_macroreg.Ma, Macroregion!="Southern Italy")
Show the code: plots
# Cereals plots ubiquity
cer_ubiquity_macroreg_R.HM <- ggplot(cer_ubiquity_macroreg.R, aes(
factor(Macroregion, levels=(level_macroreg_order)),
factor(Plant, levels=rev(level_cereals_order)),
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Roman"
) + scale_fill_gradient(low = "white", high = "black")
cer_ubiquity_macroreg_LR.HM <- ggplot(cer_ubiquity_macroreg.LR, aes(
factor(Macroregion, levels=(level_macroreg_order)),
factor(Plant, levels=rev(level_cereals_order)),
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Late Roman"
) + scale_fill_gradient(low = "white", high = "black")
cer_ubiquity_macroreg_EMA.HM <- ggplot(cer_ubiquity_macroreg.EMA, aes(
factor(Macroregion, levels=(level_macroreg_order)),
factor(Plant, levels=rev(level_cereals_order)),
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Early Medieval"
) + scale_fill_gradient(low = "white", high = "black")
cer_ubiquity_macroreg_Ma.HM <- ggplot(cer_ubiquity_macroreg.Ma, aes(
factor(Macroregion, levels=(level_macroreg_order)),
factor(Plant, levels=rev(level_cereals_order)),
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Medieval"
) + scale_fill_gradient(low = "white", high = "black")
Cereals_Ubiquity_MacroReg_Patchwork <- (cer_ubiquity_macroreg_R.HM|cer_ubiquity_macroreg_LR.HM)/(cer_ubiquity_macroreg_EMA.HM|cer_ubiquity_macroreg_Ma.HM)
Cereals_Ubiquity_MacroReg_Patchwork + plot_annotation(
title = 'Cereals',
subtitle = 'Ubiquity (%), plotted by macroregion and chronology.',
caption='Note: Data was too scarce for Roman Central Italy and Medieval Southern Italy.'
)
Show the code: plots
# Pulses plots ubiquity
puls_ubiquity_macroreg_R.HM <- ggplot(puls_ubiquity_macroreg.R, aes(
factor(Macroregion, levels=(level_macroreg_order)),
Plant,
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="#cfcfcf", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Roman"
) + scale_fill_gradient(low = "white", high = "black")
puls_ubiquity_macroreg_LR.HM <- ggplot(puls_ubiquity_macroreg.LR, aes(
factor(Macroregion, levels=(level_macroreg_order)),
Plant,
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="#ffffff", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Late Roman"
) + scale_fill_gradient(low = "white", high = "black")
puls_ubiquity_macroreg_EMA.HM <- ggplot(puls_ubiquity_macroreg.EMA, aes(
factor(Macroregion, levels=(level_macroreg_order)),
Plant,
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Early Medieval"
) + scale_fill_gradient(low = "white", high = "black")
puls_ubiquity_macroreg_Ma.HM <- ggplot(puls_ubiquity_macroreg.Ma, aes(
factor(Macroregion, levels=(level_macroreg_order)),
Plant,
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Medieval"
) + scale_fill_gradient(low = "white", high = "black")
Pulses_Ubiquity_MacroReg_Patchwork <- (puls_ubiquity_macroreg_R.HM|puls_ubiquity_macroreg_LR.HM)/(puls_ubiquity_macroreg_EMA.HM|puls_ubiquity_macroreg_Ma.HM)
Pulses_Ubiquity_MacroReg_Patchwork + plot_annotation(
title = 'Pulses',
subtitle = 'Ubiquity (%), plotted by macroregion and chronology.',
caption='Note: Data was too scarce for Roman Central Italy and Late Roman/Medieval Southern Italy.'
)
Show the code: plots
# Fruits nuts plots
fnuts_ubiquity_macroreg_R.HM <- ggplot(fnuts_ubiquity_macroreg.R, aes(
factor(Macroregion, levels=(level_macroreg_order)),
Plant,
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Roman"
) + scale_fill_gradient(low = "white", high = "black")
fnuts_ubiquity_macroreg_LR.HM <- ggplot(fnuts_ubiquity_macroreg.LR, aes(
factor(Macroregion, levels=(level_macroreg_order)),
Plant,
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Late Roman"
) + scale_fill_gradient(low = "white", high = "black")
fnuts_ubiquity_macroreg_EMA.HM <- ggplot(fnuts_ubiquity_macroreg.EMA, aes(
factor(Macroregion, levels=(level_macroreg_order)),
Plant,
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Early Medieval"
) + scale_fill_gradient(low = "white", high = "black")
fnuts_ubiquity_macroreg_Ma.HM <- ggplot(fnuts_ubiquity_macroreg.Ma, aes(
factor(Macroregion, levels=(level_macroreg_order)),
Plant,
fill=(Ubiquity)
)) +
geom_tile(colour="white") +
geom_text(aes(label = Ubiquity), colour="white", size=3)+
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "none",
axis.ticks = element_blank()
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Medieval"
) + scale_fill_gradient(low = "white", high = "black")
FrNuts_Ubiquity_MacroReg_Patchwork <- (fnuts_ubiquity_macroreg_R.HM|fnuts_ubiquity_macroreg_LR.HM)/(fnuts_ubiquity_macroreg_EMA.HM|fnuts_ubiquity_macroreg_Ma.HM)
FrNuts_Ubiquity_MacroReg_Patchwork + plot_annotation(
title = 'Fruits/Nuts',
subtitle = 'Ubiquity (%), plotted by macroregion and chronology.',
caption='Note: Data was too scarce for Roman Central Italy and Medieval Southern Italy.'
)
5.2.1.1 Cereals
In the Roman age, cereals are similarly ubiquitous in Southern and Northern Italy, although there are some exceptions (einkorn, rye, oats and proso millet) that may derive from the randomness of the samples. Unfortunately, only three sites provided botanical samples for Roman Central Italy, and their values have been omitted from the plot. These sites (from the Roman Peasant Project, Tuscany) reported only three kinds of cereal: common wheat, emmer and barley. The similar ubiquity values for the two macroregions under assessment may suggest similar production patterns across the whole peninsula in the Roman age.
In the Late Roman age, ubiquity data has been calculated for all three macroregions. Three crops are found on 62-75% of the Central Italian sites: common wheat, barley and emmer. Other cereals are present, but less ubiquitously. These three crops seem to be widespread in the south as well. Conversely, in Northern Italy common wheat and barley were important crops but competed with other cereals, including millet, sorghum and rye (the presence of which has doubled).
The Early Medieval age seems to mark a shift in agricultural practices: cereal ubiquities vary more markedly across the three macroregions. In Southern Italy, common wheat and barley were still the predominant cereals. This is also true for Central and Northern Italy; in these regions, however, other cereals are also present at a large number of sites.
The samples from the Medieval age are fewer in number, since the upper boundary of this project's chronology is the 11th c. Despite the short chronology, some observations are possible. Medieval Central Italy relied heavily on common wheat, barley and emmer, with other cereals increasingly important. Barley is the most ubiquitous cereal in Northern Italy in this period, followed by common wheat, millets and sorghum.

5.2.1.2 Pulses
In the Roman age, pulses are an important part of the diet and are cultivated in both Northern and Southern Italy. In the latter, vetch/broad beans are present in 22-32% of the samples, and lentils are present in 38% of the sites. In the Late Roman age, broad beans are equally important in Central and Northern Italy, and peas are present in 50% of the Central Italian sites. In the Early Medieval age, pulses are present in many Central Italian sites, especially blue/red peas, broad beans and other Fabaceae. Lentils and broad beans are also cultivated in almost half of the Northern Italian sites. The importance of pulses in Central Italy is confirmed by the 11th c. samples, where every species is present in over 66% of the sites and Fabaceae and blue/red peas are found in every sample. Conversely, in Northern Italy only the broad bean is found in 66% of the sites.

5.2.1.3 Fruits and nuts
Olive and grape are two essential crops in the Italian peninsula. Olive pits, as can be expected, are more ubiquitous in Southern Italy, where in Roman times they are present in more than 87% of the sites, and in over 58% of the sites in the following chronologies.¹ Conversely, the grape is important in Central and Northern Italy in the Late Roman, Early Medieval and Medieval ages.

5.3 Richness and diversity
5.3.1 Richness and diversity in the Italian macroregions
Show the code: data preparation
# Species richness based on geographical features
# RELATIVE PROPORTIONS OF ARCHAEOBOT_VIZ QUERY EXPORT FROM THE DB
# (Condensed, without totals)
# Remove NAs
Df_Cond_Plants[is.na(Df_Cond_Plants)] <-0
# Generate a dataframe with the relative proportions and round the results
Df_Cond_Plants_Rel <- decostand(Df_Cond_Plants[11:50], method = "total")
Df_Cond_Plants_Rel <- round(Df_Cond_Plants_Rel, digits=2)
# Add more info to the dataframe
Df_Cond_Plants_Rel_Richness_Diversity <- data.frame(
"Geo" = Df_Cond_Plants$Geo,
"Chronology" = Df_Cond_Plants$Chronology,
"Type"= Df_Cond_Plants$Type,
"Specnumber" = specnumber(Df_Cond_Plants_Rel),
"Shannon Div" = diversity(Df_Cond_Plants_Rel),
Df_Cond_Plants_Rel
)
Df_Cond_Plants_Rel_withMacroregion <- data.frame("Geo" = Df_Cond_Plants$Geo,
"Chronology" = Df_Cond_Plants$Chronology,
"Type"= Df_Cond_Plants$Type,
"Macroregion" = Df_Cond_Plants$name_macroreg,
"Specnumber" = specnumber(Df_Cond_Plants_Rel[1:10]), #Only cereals
"Shannon Div" = diversity(Df_Cond_Plants_Rel[1:10]),
Df_Cond_Plants_Rel[1:10]
)
# Let's plot the diversity by macroregion
# Creating the dataframes for R and EMA age
# I know it's called "Plants" but it's actually just cereals
Df_Cond_Plants_Rel_withMacroregion.R <- filter(Df_Cond_Plants_Rel_withMacroregion, Chronology == "R")
Df_Cond_Plants_Rel_withMacroregion.LR <- filter(Df_Cond_Plants_Rel_withMacroregion, Chronology == "LR")
Df_Cond_Plants_Rel_withMacroregion.EMA <- filter(Df_Cond_Plants_Rel_withMacroregion, Chronology == "EMA")
Show the code: plots
pal_RichnessvsGeo <- c("cadetblue3", "gold1", "bisque4", "palegreen4")
plot_RichnessMacroReg.R <- ggplot(Df_Cond_Plants_Rel_withMacroregion.R, aes(x = Macroregion, y = Specnumber, fill = Macroregion)) +
geom_violin(trim=FALSE) +
geom_boxplot(width=0.1, fill="white")+
scale_fill_manual(values = pal_RichnessvsGeo) +
geom_jitter(alpha=0.3)+
scale_x_discrete(labels = c("Central Italy \n (n = 3)", "Northern Italy \n (n = 39)", "Southern Italy \n (n=31)")) +
theme(legend.position = "none",
plot.background = element_rect("white"),
panel.background = element_rect("white"),
panel.grid = element_line("grey90"),
axis.line = element_line("gray25"),
axis.text = element_text(size = 12, color = "gray25"),
axis.title = element_text(color = "gray25"),
legend.text = element_text(size = 12)) +
labs(x = "Macroregion",
y = "Number of species per site",
title = "R - Cereal richness")
plot_RichnessMacroReg.EMA <- ggplot(Df_Cond_Plants_Rel_withMacroregion.EMA, aes(x = Macroregion, y = Specnumber, fill = Macroregion)) +
geom_violin(trim=FALSE) +
geom_boxplot(width=0.1, fill="white")+
scale_fill_manual(values = pal_RichnessvsGeo) +
geom_jitter(alpha=0.3)+
scale_x_discrete(labels = c("Central Italy \n (n = 10)", "Northern Italy \n (n = 36)", "Southern Italy \n (n=17)")) +
theme(legend.position = "none",
plot.background = element_rect("white"),
panel.background = element_rect("white"),
panel.grid = element_line("grey90"),
axis.line = element_line("gray25"),
axis.text = element_text(size = 12, color = "gray25"),
axis.title = element_text(color = "gray25"),
legend.text = element_text(size = 12)) +
labs(x = "Macroregion",
y = "Number of species per site",
title = "EMA - Cereal richness")
Cereal richness is similar in Roman Northern and Southern Italian sites (Figure 5.6). Central Italy reports higher values, although this is based on only three sites and is therefore not reliable. During the Early Middle Ages, Central Italy is again the richest in cereals, closely followed by Northern Italy. Interestingly, Southern Italy still reports values very close to those of the Roman age. A full list of the Southern Italian EMA sites is reported in Table 5.1.


| ID | Site | Region | Geography | Type | Culture/Influence |
|---|---|---|---|---|---|
| 98 | S. Maria in Cività, D85 | Molise | Hilltop | Urban | Lombard |
| 107 | S. Giovanni di Ruoti, Phase 3A | Basilicata | Mountain | Monastery | Lombard |
| 107 | S. Giovanni di Ruoti, Phase 3B | Basilicata | Mountain | Monastery | Lombard |
| 198 | Salapia, area botteghe, US 2475 | Puglia | Coast/Lagoon | Urban | Lombard |
| 198 | Salapia, area botteghe, US 2437 | Puglia | Coast/Lagoon | Urban | Lombard |
| 199 | Salapia, area conceria, US 2054 | Puglia | Coast/Lagoon | Urban | Lombard |
| 199 | Salapia, area conceria, US 2211-2217 | Puglia | Coast/Lagoon | Urban | Lombard |
| 199 | Salapia, area conceria, 8th-9th c. | Puglia | Coast/Lagoon | Urban | Lombard |
| 196 | Faragola, wastepit 61 | Puglia | Plain | Rural, villa | Lombard |
| 196 | Faragola, wastepit 66 | Puglia | Plain | Rural, villa | Lombard |
| 234 | Colle Castellano, Phase 3-4 | Molise | Hill | Urban | Lombard |
| 177 | San Vincenzo al Volturno, kitchen area | Molise | Hill | Monastery | Lombard |
| 101 | Supersano, loc. Scorpo | Puglia | Plain | Rural | Byzantine |
| 250 | Apigliano, 9th-10th c., pits | Puglia | Plain | Rural | Byzantine |
| 250 | Apigliano, 10th-11th c., pits | Puglia | Plain | Rural | Byzantine |
| 196 | Faragola, granary A7 | Puglia | Plain | Rural, villa | Lombard |
| 196 | Faragola, granary A8 | Puglia | Plain | Rural, villa | Lombard |
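
The two measures computed in Section 5.3.1 with `specnumber()` and `diversity()` can be illustrated with a short Python sketch. It assumes the Shannon index with natural logarithms and uses invented counts; it is not the chapter's R code. Richness only counts how many taxa are present, whereas the Shannon index also drops when a single taxon dominates the assemblage.

```python
import math


def richness(counts):
    """Number of taxa with at least one recorded remain."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over the taxa that are present."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Two invented assemblages with the same richness but different evenness.
even_site = [10, 10, 10, 10]
dominated_site = [37, 1, 1, 1]

for name, counts in [("even", even_site), ("dominated", dominated_site)]:
    print(name, richness(counts), round(shannon(counts), 3))
```

Both hypothetical sites have a richness of four, yet the evenly distributed assemblage reaches the maximum Shannon value for four taxa (ln 4), while the dominated one scores much lower.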
5.4 Cereals regionality: testing the results
5.4.1 PERMANOVA
Permutational multivariate analysis of variance (PERMANOVA) is a non-parametric multivariate statistical test used to compare groups of objects. It tests the null hypothesis that the centroids and dispersion of the groups, as defined in the measure space, are identical. The null hypothesis is rejected if either the centroid or the spread of the objects differs between the groups. The test is based on a prior calculation of the distance between any two objects included in the experiment² (Anderson 2017). In this context, the null hypothesis is that there is no regional difference in the cereals dataset, with cereals being evenly distributed across macroregions and chronologies.
The suggestion of an Early Medieval shift in cereal farming made in Section 5.2.1 and Section 5.3.1 needs statistical support. Considering that the data are not unimodal and that we are dealing with presence/absence analysis, the best choice is to use a non-parametric test such as PERMANOVA on the early medieval botanical dataset. Prior to performing the test, it was necessary to pre-process the data by:
1. Selecting all the cereals columns of the plant remains table, keeping some categorical variables: Macroregion, Chronology, Geography and Type.
2. Removing the empty rows (caused by the fact that some sites have seeds/fruits, but not cereals).
3. Transforming the raw counts into presence/absence, using the function decostand() (method = "pa") in the R package vegan (Oksanen et al. 2020).
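
The pre-processing steps listed above can be sketched in a few lines of Python. This is an illustration under simplified assumptions (three invented rows, three cereal columns, a single categorical variable), not a translation of the R code that follows.

```python
# Invented raw counts: one dictionary per site/context.
raw_counts = [
    {"Macroregion": "Northern Italy", "Barley": 14, "Rye": 3, "Emmer": 0},
    {"Macroregion": "Southern Italy", "Barley": 0, "Rye": 0, "Emmer": 0},  # no cereals
    {"Macroregion": "Southern Italy", "Barley": 250, "Rye": 0, "Emmer": 1},
]
cereal_columns = ["Barley", "Rye", "Emmer"]


def to_presence_absence(rows, taxa):
    """Drop rows without any cereal and recode counts as 1 (present) / 0 (absent)."""
    kept = []
    for row in rows:
        if sum(row[taxon] for taxon in taxa) == 0:
            continue  # empty row: the site yielded other remains, but no cereals
        binary = {taxon: int(row[taxon] > 0) for taxon in taxa}
        kept.append({"Macroregion": row["Macroregion"], **binary})
    return kept


presence_absence = to_presence_absence(raw_counts, cereal_columns)
for row in presence_absence:
    print(row)
```

After the transformation, the 250 barley remains of the third hypothetical row carry the same weight as the single emmer grain, which is the intended effect when sample sizes are not comparable.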
Show the code: Pre-processing
# Testing the results: Regionality in the dataset?
library(vegan)
set.seed(29)
# Pre processing: remove empty rows
# Note: The input table is the CONDENSED table without totals
# Selecting all the cereals columns of the plant remains table, keeping some categorical variables
cer_macroreg_ubiquity_transp.tot <- Df_Cond_Plants[c(4,5,6,7,11:19)]
# Selecting all rows with data (since we selected only with cereals, and some sites only had fruits/pulses we might have empty rows)
cer_macroreg_ubiquity_transp.tot <- cer_macroreg_ubiquity_transp.tot[rowSums(cer_macroreg_ubiquity_transp.tot[5:13])>0,]
# Assigning a column name "Macroregion"
colnames(cer_macroreg_ubiquity_transp.tot)[1] = "Macroregion"
# Selecting the Chronology of interest (EMA) and excluding Central Italy
cer_macroreg_ubiquity_transp.tot<- filter(cer_macroreg_ubiquity_transp.tot, Macroregion!="Central Italy" & Chronology=="EMA")
# dividing categorical and numerical columns
cer_macroreg_ubiquity_transp.categ <- cer_macroreg_ubiquity_transp.tot[1:4]
cer_macroreg_ubiquity_transp.data <- cer_macroreg_ubiquity_transp.tot[5:13]
# Converting the numerical columns into a presence/absence matrix (using method=pa)
cer_macroreg_ubiquity_transp.dist <- decostand(cer_macroreg_ubiquity_transp.data, method="pa", na.rm=TRUE)
After the pre-processing, it was possible to run the PERMANOVA using the function adonis2() in the package vegan. The function creates a distance matrix and computes an analysis of variance on the matrix. The method chosen to calculate the distance matrix is the Jaccard distance. The Jaccard distance (Kosub 2019), based on the Jaccard similarity index, is a measure of dissimilarity between sample sets: one minus the ratio of the taxa shared by two samples to all the taxa present in either of them. When compared to other dissimilarity indices, it is more appropriate for presence/absence analyses as it is not based on Euclidean distance.
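
For two presence/absence vectors, the Jaccard distance is one minus the number of shared taxa divided by the number of taxa present in at least one sample. The Python sketch below uses invented vectors; returning 0 for two empty samples is a convention chosen here for illustration only (empty rows were removed during pre-processing).

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length presence/absence (0/1) vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        return 0.0  # convention for two empty samples (an assumption of this sketch)
    return 1 - shared / either


# Invented samples; columns could stand for five cereal taxa.
site_1 = [1, 1, 0, 0, 1]
site_2 = [1, 0, 0, 1, 1]

print(jaccard_distance(site_1, site_2))
```

Taxa absent from both samples (the third column here) do not affect the result, which is desirable when absences may simply reflect poor preservation or small samples.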
Results of adonis2().
Show the code: adonis2()
cer_macroreg_ubiquity_transp.div <- adonis2(
cer_macroreg_ubiquity_transp.dist ~ Macroregion,
data = cer_macroreg_ubiquity_transp.categ,
permutations = 10000, method="jaccard"
)
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 10000
adonis2(formula = cer_macroreg_ubiquity_transp.dist ~ Macroregion, data = cer_macroreg_ubiquity_transp.categ, permutations = 10000, method = "jaccard")
            Df SumOfSqs      R2      F    Pr(>F)
Macroregion  1   1.3024 0.13495 7.3322 9.999e-05 ***
Residual    47   8.3487 0.86505
Total       48   9.6512 1.00000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
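The results of the PERMANOVA indicate that the variable Macroregion is highly significant (p ≈ 0.0001, the smallest value attainable with 10,000 permutations), so the null hypothesis of no macroregional difference in the early medieval cereals dataset can be rejected. Macroregion accounts for about 13% of the total variation (R2 = 0.135).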
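To make the logic of the test more concrete, the following Python sketch computes a pseudo-F statistic from a Jaccard distance matrix and estimates a p-value by shuffling the group labels. It is a simplified one-factor illustration on invented assemblages, not a re-implementation of `adonis2()`; the function names, the data and the number of permutations are assumptions of this sketch.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard distance between two non-empty sets of taxa."""
    return 1 - len(a & b) / len(a | b)


def pseudo_f(dist, labels):
    """Pseudo-F: among-group vs within-group sums of squared distances."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for group in groups:
        members = [i for i in range(n) if labels[i] == group]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dist, labels, permutations=999, seed=29):
    """Return the observed pseudo-F and its permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # The observed arrangement counts as one permutation.
    return observed, (hits + 1) / (permutations + 1)


# Invented assemblages: mixed cereals in the "North", wheat and barley in the "South".
north = [
    {"Wheat", "Barley", "Rye", "Millet"},
    {"Barley", "Rye", "Millet"},
    {"Wheat", "Rye", "Millet", "Oats"},
    {"Barley", "Millet", "Oats"},
    {"Rye", "Millet", "Sorghum"},
]
south = [
    {"Wheat", "Barley"},
    {"Wheat"},
    {"Wheat", "Barley"},
    {"Wheat", "Barley", "Emmer"},
    {"Barley"},
]
sites = north + south
labels = ["North"] * len(north) + ["South"] * len(south)
dist = [[jaccard(a, b) for b in sites] for a in sites]

f_obs, p_value = permanova(dist, labels)
print(round(f_obs, 2), p_value)
```

Because the observed arrangement is counted among the permutations, the p-value can never fall below 1 / (permutations + 1), which explains why a run with 10,000 permutations bottoms out at 9.999e-05.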
After running the PERMANOVA, it is necessary to check the homogeneity of the group dispersions to confirm the results (especially when dealing with small groups of data), since a significant PERMANOVA can also arise when the groups differ only in their spread. The function betadisper() from the package vegan provides the distances of the samples in each group from their group centroid. If the dispersions are similar, the null hypothesis of no difference in dispersion between groups cannot be rejected. To test the variation, it is possible to use the analysis of variance (ANOVA).
Show the code: Betadisper()
# We do not need to calculate the distance separately, but it will be useful later for the betadisper() function
# Distance dissimilarity matrix with the Jaccard method
cer_macroreg_ubiquity_transp.dist2 <- vegdist(cer_macroreg_ubiquity_transp.dist, method="jaccard", na.rm=TRUE)
# Betadisper: distances of group samples from centroids
cer_macroreg_ubiquity_transp.betadisper <- betadisper(cer_macroreg_ubiquity_transp.dist2, cer_macroreg_ubiquity_transp.categ$Macroregion)
Results of anova() on the betadisper.
Show the code: ANOVA on betadisper()
# We will see that the ANOVA's p-value is not significant meaning that group dispersions are homogenous
#("Null hypothesis of no difference in dispersion between groups"; https://www.rdocumentation.org/packages/vegan/versions/2.4-2/topics/betadisper).
anova(cer_macroreg_ubiquity_transp.betadisper) # This should not be significant!
Analysis of Variance Table
Response: Distances
          Df  Sum Sq   Mean Sq F value Pr(>F)
Groups     1 0.00211 0.0021112  0.0715 0.7903
Residuals 47 1.38696 0.0295098


The betadisper() graphs (Figure 5.7) show similar distances from the centroids for the categories Northern Italy and Southern Italy. In addition, the ANOVA on the betadisper() result shows that the difference in dispersion is not significant (p = 0.79, well above the significance threshold), meaning that the group dispersions are homogeneous. We can now be more confident in the PERMANOVA results and accept the difference between the two groups of sites under investigation. In other words, the cereal assemblages of the Southern and Northern Italian groups of sites differ during the Early Middle Ages.
Running the same tests on the Roman sites failed to separate the two groups of sites, which is consistent with there being no major difference in the types of cereals cultivated during the Roman age between Northern and Southern Italy.
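The logic of the dispersion check described above can be sketched in Python. This is a simplified illustration on synthetic presence/absence data, not a reimplementation of vegan's betadisper(): the data, the group labels and the helper pcoa() are assumptions made for the example, negative eigenvalues of the principal coordinates analysis are simply dropped, and the arithmetic centroid is used for each group.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import f_oneway

rng = np.random.default_rng(1)

# Synthetic presence/absence matrix: 30 sites x 8 taxa (illustrative only)
X = rng.integers(0, 2, size=(30, 8)).astype(bool)
X[:, 0] = True  # one taxon present everywhere, so no site is empty
groups = np.array(["North"] * 18 + ["South"] * 12)  # hypothetical labels

# Jaccard dissimilarity between all pairs of sites
D = squareform(pdist(X, metric="jaccard"))


def pcoa(dist):
    """Principal coordinates of a distance matrix (positive axes only)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigval, eigvec = np.linalg.eigh(gram)
    keep = eigval > 1e-10  # simplification: ignore negative eigenvalues
    return eigvec[:, keep] * np.sqrt(eigval[keep])


coords = pcoa(D)

# Distance of every site from the centroid of its own group
dist_to_centroid = {
    g: np.linalg.norm(coords[groups == g] - coords[groups == g].mean(axis=0), axis=1)
    for g in np.unique(groups)
}

# One-way ANOVA on the distances: a large p-value means that there is
# no evidence of a difference in dispersion between the groups
f_stat, p_value = f_oneway(*dist_to_centroid.values())
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
```

A non-significant p-value here plays the same role as the non-significant anova() result reported above: it gives no reason to attribute a PERMANOVA result to unequal spread.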
5.4.2 NCA
The Wasserstein distance (or earth mover's distance) is a measure of distance between two probability distributions on a metric space.
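As a minimal illustration of this measure, the snippet below applies scipy's wasserstein_distance() to two small one-dimensional samples. The numbers are toy values chosen for the example and are unrelated to the archaeobotanical dataset.

```python
from scipy.stats import wasserstein_distance

# Two toy one-dimensional samples (illustrative values only)
group_a = [0, 1, 3]
group_b = [5, 6, 8]

# group_b is group_a shifted by 5, so all the mass has to travel 5 units
d_shift = wasserstein_distance(group_a, group_b)

# Same support, different weights: u_weights belong to the first sample,
# v_weights to the second; both are normalised to sum to 1
d_weighted = wasserstein_distance([0, 1], [0, 1], u_weights=[3, 1], v_weights=[2, 2])

print(d_shift, d_weighted)
```

In the weighted case the first sample places 0.75 of its mass at 0 and the second only 0.5, so a quarter of the mass has to move one unit.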
In addition to statistically testing the separation between the Northern and Southern Italian early medieval cereal datasets, a dimensionality reduction algorithm has been used to measure the distance between groups of sites in both the Roman age and the early Middle Ages. For this task, a machine learning algorithm for metric learning has been chosen: Neighborhood Components Analysis (NCA), implemented in the Python class NeighborhoodComponentsAnalysis (in sklearn.neighbors). A more in-depth explanation of the algorithm can be found in Section 4.6.2.4. To work with balanced groups of samples, the group sizes have been arbitrarily set to 20 random samples (for each macroregion and chronology), allowing replacement (a sample can randomly be picked twice). The random samples are drawn with the sample() method of pandas data frames. The NCA has been run with a reduction to a single dimension, using KDE plots to visualize the results; a single dimension makes the distance between groups easier to calculate.
Figure 5.8 (a) shows the NCA performed on the Roman cereals presence/absence dataset. As already pointed out, the PERMANOVA did not produce significant results for this dataset, and the Wasserstein distance (calculated with the wasserstein_distance() function in the scipy library) is indeed smaller for the Roman dataset (W = 67.64) than for the EMA dataset (W = 200.65). For both chronologies there is an overlap in the curves, which is more considerable in the Roman age (indicating that the groups of samples are more similar). The graph for the EMA chronology (Figure 5.8, b) shows a clearer separation of the macroregional groups, with some minor overlaps. This overlap is probably due to the fact that the presence of the noble grains is not by itself a ‘marker’ of Southern Italian sites: these grains are also very common in the North. The difference is that in the South noble grains are not cultivated in conjunction with other grains.
Moreover, the graph also displays variability in the Northern Italian dataset. The variability can also be assessed from the outliers in the boxplots in Figure 5.9.
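The workflow described above (balanced resampling with replacement, NCA reduced to one dimension, Wasserstein distance between the two groups) can be sketched end to end on synthetic data. The data frame, the taxon_* column names, the presence probabilities and the helper fake_sites() are all assumptions made for this example; the sketch is unweighted and does not reproduce the values reported in Figure 5.8.

```python
import numpy as np
import pandas as pd
from scipy.stats import wasserstein_distance
from sklearn.neighbors import NeighborhoodComponentsAnalysis

rng = np.random.default_rng(0)
taxa = [f"taxon_{i}" for i in range(6)]  # hypothetical cereal columns


def fake_sites(n, probs, region):
    """Synthetic presence/absence rows for one macroregion."""
    df = pd.DataFrame(rng.binomial(1, probs, size=(n, len(probs))), columns=taxa)
    df["Macroregion"] = region
    return df


# Unbalanced groups, as is typical of real site lists
north = fake_sites(35, [0.8, 0.7, 0.6, 0.6, 0.5, 0.5], "Northern Italy")
south = fake_sites(15, [0.9, 0.8, 0.2, 0.1, 0.1, 0.1], "Southern Italy")

# Balance the groups: 20 draws per macroregion, with replacement
balanced = pd.concat(
    [g.sample(20, replace=True, random_state=7) for g in (north, south)],
    ignore_index=True,
)

X = balanced[taxa].to_numpy(dtype=float)
y = balanced["Macroregion"]

# Supervised reduction to a single dimension
nca = NeighborhoodComponentsAnalysis(n_components=1, init="lda", random_state=0)
balanced["value"] = nca.fit_transform(X, y).ravel()

# Distance between the two one-dimensional distributions
is_north = balanced["Macroregion"] == "Northern Italy"
w = wasserstein_distance(balanced.loc[is_north, "value"], balanced.loc[~is_north, "value"])
print(f"Wasserstein distance: {w:.2f}")
```

Because the projection is learned from the labels, the resulting distance is best read comparatively (one chronology against another), as is done in the text, rather than as an absolute quantity.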
Source code:
Show the code: Libraries
# Load Python libraries
#!pip install pandas
import pandas as pd
import os
#!pip install scikit-hubness
import random
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import (KNeighborsClassifier,
NeighborhoodComponentsAnalysis)
from scipy.stats import wasserstein_distance
import seaborn as sns
# Set seed
random.seed(10)
Show the code: Selecting random samples, with replacement
#R
# 20 random rows per macroregion, drawn with replacement
df_R_SI = df_R[df_R["Macroregion"]=="Southern Italy"].sample(20, random_state=7, replace=True)
df_R_NI = df_R[df_R["Macroregion"]=="Northern Italy"].sample(20, random_state=7, replace=True)
# Create a dataset with northern and southern Italy
df_R_merge = pd.concat([df_R_SI, df_R_NI], ignore_index=True)
data_R_byPlantGroup = df_R_merge.drop(['Chronology','Type', 'Macroregion', 'Weight'], axis=1)
labels_R_byPlantGroup = df_R_merge.iloc[:,2] # nrow, 0 for Chronology - nrow, 1 for Type - nrow,2 for Macroregion
#EMA
df_EMA_SI = df_EMA[df_EMA["Macroregion"]=="Southern Italy"].sample(20, random_state=7, replace=True)
df_EMA_NI = df_EMA[df_EMA["Macroregion"]=="Northern Italy"].sample(20, random_state=7, replace=True)
df_EMA_merge = pd.concat([df_EMA_SI, df_EMA_NI], ignore_index=True)
data_EMA_byPlantGroup = df_EMA_merge.drop(['Chronology','Type', 'Macroregion', 'Weight'], axis=1)
labels_EMA_byPlantGroup = df_EMA_merge.iloc[:,2] # nrow, 0 for Chronology - nrow, 1 for Type - nrow,2 for Macroregion
Show the code: Performing the NCA
#R
nca_for_KDE_R_PlantGroup = NeighborhoodComponentsAnalysis(n_components =1, init="lda").fit(data_R_byPlantGroup, labels_R_byPlantGroup)
reduction_for_KDE_R_PlantGroup = nca_for_KDE_R_PlantGroup.transform(data_R_byPlantGroup)
df_R_merge["value"] = reduction_for_KDE_R_PlantGroup
#EMA
nca_for_KDE_EMA_PlantGroup = NeighborhoodComponentsAnalysis(n_components =1, init="lda").fit(data_EMA_byPlantGroup, labels_EMA_byPlantGroup)
reduction_for_KDE_EMA_PlantGroup = nca_for_KDE_EMA_PlantGroup.transform(data_EMA_byPlantGroup)
df_EMA_merge["value"] = reduction_for_KDE_EMA_PlantGroup
Show the code: Wasserstein distance
# R
df_R_North = df_R_merge[df_R_merge["Macroregion"] == "Northern Italy"]
df_R_South = df_R_merge[df_R_merge["Macroregion"] == "Southern Italy"]
# u_weights must belong to the first sample (South) and v_weights to the second (North)
wasserstein_distance(df_R_South["value"], df_R_North["value"], u_weights=df_R_South["Weight"], v_weights=df_R_North["Weight"])
#EMA
df_EMA_North = df_EMA_merge[df_EMA_merge["Macroregion"] == "Northern Italy"]
df_EMA_South = df_EMA_merge[df_EMA_merge["Macroregion"] == "Southern Italy"]
# u_weights must belong to the first sample (South) and v_weights to the second (North)
wasserstein_distance(df_EMA_South["value"], df_EMA_North["value"], u_weights=df_EMA_South["Weight"], v_weights=df_EMA_North["Weight"])
Plots
Show the code: Plotting the NCA
NCA_KDE_1D, ax = plt.subplots(1, 2, figsize=(10, 5), sharey=True, sharex=True)
sns.kdeplot(data=df_R_merge, x="value", ax=ax[0], hue="Macroregion", fill=True, alpha=.1, palette="colorblind", linewidth=1, legend=False).set(xlabel='NCA', title="(a) Roman age")
sns.kdeplot(data=df_EMA_merge, x="value", ax=ax[1], hue="Macroregion", fill=True, alpha=.1, palette="colorblind", linewidth=1).set(xlabel='NCA', title="(b) early Middle ages")
NCA_KDE_1D.text(0.08, 0.03, 'Weighted Wasserstein Distance: W = 67.64 \nPERMANOVA test: p>0.05', fontsize=10)
NCA_KDE_1D.text(0.68, 0.03, 'Weighted Wasserstein Distance: W = 200.65\nPERMANOVA test: p<0.001', fontsize=10)
plt.tight_layout()
plt.subplots_adjust(bottom=0.19)
plt.show()
